% Copyright 2016 Sandia Corporation. Under the terms of Contract DE-AC04-94AL85000 with Sandia
% Corporation, the U.S. Government retains certain rights in this software

% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions are
% met:
% 
%     (1) Redistributions of source code must retain the above copyright
%     notice, this list of conditions and the following disclaimer. 
% 
%     (2) Redistributions in binary form must reproduce the above copyright
%     notice, this list of conditions and the following disclaimer in
%     the documentation and/or other materials provided with the
%     distribution.  
%     
%     (3)The name of the author may not be used to
%     endorse or promote products derived from this software without
%     specific prior written permission.
% 
% THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
% IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
% WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
% DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
% INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
% (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
% SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
% HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
% STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
% IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
% POSSIBILITY OF SUCH DAMAGE.

function decontaminatedD = despikeHyperUBSiteration(D,maxComponents,readoutSigma,maxPixelFraction,originalD)

%FUNCTION decontaminatedD = despikeHyperUBSiteration(D,maxComponents,readoutSigma,maxPixelFraction,originalD)
%
% PURPOSE:
%   despikeHyperUBSiteration is designed to take spectral data and remove 
%	cosmic spikes. Multiple iterations of this code are recommended. Each 
%	iteration attenuates cosmic spikes, but multiple cycles are often 
%	needed to reduce cosmic spikes to the typical noise-level variations. 
%   despikeHyperUBSiteration is designed for hyperspectral data. It 
%	assumes that many spectra (pixels) have been recorded, and that each 
%	spectral component of interest is present in multiple pixels. 
%
% DEPENDENCIES:
%	eliminateUniquePixels (local function, defined later in this file)
%	medfilt2 (Image Processing Toolbox)
%
% CLASSES:
%	-None-
%
% INPUTS:
%   D:
%       An m by n matrix, where each of the m rows corresponds to a 
%       spectrum. 
%   maxComponents: 
%       The maximum number of expected spectral components. 
%   readoutSigma:
%       The expected readout noise standard deviation, typically the
%       readout noise of the camera. 
%	maxPixelFraction:
%		Pixels which are responsible for greater than this fraction of the
%		total weight (positive or negative portions) of a spectral
%		component must correspond to spikes. The amount of that spectral
%		component in that pixel is then set to zero. 
%	originalD:
%		For the first iteration, this is equivalent to D. For later
%		iterations, D will be the decontaminated result from the last round
%		while originalD will be the original, unprocessed matrix. 
%
% OUTPUTS:
%   decontaminatedD:
%       A matrix equivalent to D, but with cosmic spikes attenuated. 
%
% REFERENCES:
%   This code was originally based on the following reference for the
%   UBS-DM algorithm. 
%   1)  Zhang, D. M. and D. Ben-Amotz (2002). "Removal of cosmic spikes 
%       from hyper-spectral images using a hybrid upper-bound spectrum 
%       method." Applied Spectroscopy 56(1): 91-98.
%	This code can be used to run the UBS-DM algorithm (with very minor 
%	modifications) by setting maxPixelFraction=1. However, the code is
%	actually an implementation of the UBS-DM-HS algorithm, which is more
%	robust and generally performs better. The UBS-DM-HS algorithm is
%	documented in the following reference:
%	2)	Anthony, S. M. and J. A. Timlin (2016). "Removing Cosmic Spikes 
%		Using a Hyperspectral Upper-Bound Spectrum Method." Applied
%		Spectroscopy.
%
% PROGRAMMER COMMENTS:
%   Multiple iterations of this code are recommended. Each iteration 
%	attenuates cosmic spikes, but multiple cycles are often needed before 
%	the algorithm converges on a stable result, ideally with the cosmic 
%	spikes reduced to the typical noise-level variations. 
%
%	When performing one iteration, D and originalD are the same. When
%	performing multiple iterations, D should be the decontaminatedD from
%	the last iteration but originalD should always be the original data
%	before any iterations were performed. 
%
% LIMITATIONS:
%	1) despikeHyperUBSiteration is designed for hyperspectral data. It 
%		assumes that many spectra (pixels) have been recorded, and that 
%		each spectral component of interest is present in multiple pixels. 
%	2) The elimination of components from pixels that dominate that
%		component can result in reductions below baseline. The data should
%		be baseline corrected as much as possible first. 
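%
% EXAMPLE:
%	A minimal usage sketch of the iterative calling pattern described
%	under PROGRAMMER COMMENTS. It is not taken from the references. The
%	variable rawD (an m by n data matrix, one spectrum per row) and all
%	parameter values are illustrative assumptions, not recommended
%	settings. 
%
%		nIterations      = 5;    %assumed; more may be needed to converge
%		maxComponents    = 4;    %assumed
%		readoutSigma     = 3;    %assumed, in the same units as rawD
%		maxPixelFraction = 0.1;  %assumed
%		cleanD = rawD;           %first iteration: D equals originalD
%		for k = 1:nIterations
%			cleanD = despikeHyperUBSiteration(cleanD,maxComponents,...
%				readoutSigma,maxPixelFraction,rawD);
%		end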

% Revision:
%	Sandia Hyper-UBS 1.0.0

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Supply a default. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%If only one iteration is being performed, these two inputs are equivalent,
%so the second input is unnecessary. Use this equivalence as the default if
%the final input is not provided. 
if ~exist('originalD','var')
	originalD = D;
end
	

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Perform singular value decomposition and related processing
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%The maximum number of group I eigenvectors should be significantly larger
%than the maximum number of expected components
maxEigenvectors = 10*maxComponents;

%"First perform the singular value decomposition of the data matrix D."
[U,Sigma,V] = svd(D,'econ');
%Transpose to match notation of paper
V = V';

%Score matrix
S = U*Sigma;

%The eigenvalue vector
E = diag(Sigma).^2;

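%Cosmic spike probability factor. Note that reference 1 incorrectly
%specifies taking the maximum along the rows of S, while the explanatory
%text makes clear that S_i should actually be the columns of S. The
%dimension is given explicitly so that the maximum is always taken down
%each column, even if S has only a single row. 
tmp = S.^2;
F = max(tmp,[],1)./sum(tmp,1);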
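%Worked illustration (hypothetical scores, not taken from the references):
%a column of S with scores [3;0;0;0] has all of its squared weight in one
%pixel, so F = 9/9 = 1. A column with scores [1;1;1;1] is spread evenly
%over four pixels, so F = 1/4 = .25. Values of F near 1 therefore indicate
%an eigenvector dominated by a single pixel. 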

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Initialize
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

nVectors    = numel(E);
groupI      = false(nVectors,1);
groupIIa    = false(nVectors,1);
groupIIb    = false(nVectors,1);
nGroupI     = 0;
eigenSum    = 0;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Determine which eigenvectors should be assigned to each group
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%Any eigenvectors with F>=.25 are to be excluded from group I
possGroupI = F<.25;

%Iterate until we have enough values in group I
j=1;
while j<=nVectors
    %Any eigenvectors with F>=.25 are to be excluded from group I, but if
    %we are not yet done with group I, they should be assigned to group IIa
    if possGroupI(j)
        groupI(j)   = true;
        nGroupI     = nGroupI+1;
        eigenSum    = eigenSum + E(j);
        if nGroupI >= maxEigenvectors
            %We have found the maximum number of eigenvectors for Group I
            break
        elseif eigenSum/sum(E) > .995
            %The accumulated eigenvalue fraction is over 99.5 percent. 
            break
        end
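    else
        groupIIa(j) = true;
        %Zero this eigenvalue so that it no longer counts toward sum(E),
        %the total against which the accumulated fraction is measured. 
        E(j) = 0;
    end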
    j=j+1;
end

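%Assign up to the next 40 eigenvectors to group IIb, where the actual
%number assigned is the lesser of 40 and the number of eigenvectors
%remaining. Note that when the loop above exits via break, j still indexes
%the last eigenvector assigned to group I, so that eigenvector is also
%flagged as group IIb here and receives the 5 point median filter. 
startIIb = j;
endIIb = min(j+39,nVectors);
groupIIb(startIIb:endIIb)=true;
nVectorsUsed = endIIb;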

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Extract the eigenvectors to be used and corresponding variables
%
%Filter the group II eigenvectors using either 5 or 7 pixel median filter
%algorithms
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Vnew        = V(1:nVectorsUsed,:);
Snew        = S(:,1:nVectorsUsed);
groupIIa    = groupIIa(1:nVectorsUsed);
groupIIb    = groupIIb(1:nVectorsUsed);

%For group IIa, use a 7 point median filter
VIIa = Vnew(groupIIa,:);
VIIa = medfilt2(VIIa, [1 7],'symmetric');
Vnew(groupIIa,:) = VIIa;

%For group IIb, use a 5 point median filter. Group IIa are more likely to
%contain spikes of multiple pixel width, hence the larger window for group
%IIa
VIIb = Vnew(groupIIb,:);
VIIb = medfilt2(VIIb, [1 5],'symmetric');
Vnew(groupIIb,:) = VIIb;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%The following is a departure from simply implementing the referenced
%paper. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Snew = eliminateUniquePixels(Snew,maxPixelFraction);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Generate the reconstructed and the upper bound spectrum data matrices 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

reconstructedD = Snew * Vnew;
reconstructedD = max(reconstructedD,0);

Dubs = reconstructedD + 4*sqrt(reconstructedD + readoutSigma.^2);
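%Worked illustration (hypothetical values): the upper bound lies four
%standard deviations above the reconstruction, with the variance modeled
%as reconstructedD + readoutSigma.^2. Reading the first term as shot noise
%is an interpretation, and assumes the data are in units where the
%variance of the signal equals its mean (e.g. detected counts). For a
%reconstructed value of 100 and readoutSigma = 6:
%   Dubs = 100 + 4*sqrt(100 + 36), which is approximately 146.6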

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Generate the decontaminated (cosmic-spike free) data matrix
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

decontaminatedD = originalD;
replace = originalD>Dubs;
decontaminatedD(replace) = reconstructedD(replace);


function Snew = eliminateUniquePixels(Snew,maxPixelFraction)

%FUNCTION Snew = eliminateUniquePixels(Snew,maxPixelFraction)
%
% PURPOSE: 
%	eliminateUniquePixels is designed to determine whether an excessive
%	fraction of any eigenvector is localized to any individual pixel. Since
%	real spectral components are present in many pixels, eigenvectors
%	dominated by a few pixels are almost certainly cosmic spikes, where the
%	pixels which contain significant contributions have cosmic spikes in
%	that location. Zeroing the contribution of such eigenvectors in those
%	pixels eliminates the cosmic spikes. The contributions in other pixels
%	are left, as they are minimal and may correspond to standard noise. 
%
% DEPENDENCIES:
%	-None-
%
% CLASSES:
%	-None-
%
% INPUTS:
%	Snew:
%		A score value for each pixel for each of the eigenvectors. Each
%		column corresponds to one eigenvector, while each row corresponds
%		to one pixel. 
%	maxPixelFraction:
%		Pixels which are responsible for greater than this fraction of the
%		total weight (positive or negative portions) of a spectral
%		component must correspond to spikes. The amount of that spectral
%		component in that pixel is then set to zero. 
%
% OUTPUTS:
%	Snew:
%		A score value for each pixel for each of the eigenvectors. Each
%		column corresponds to one eigenvector, while each row corresponds
%		to one pixel. Any entries in Snew which corresponded to excessive
%		contribution from a single pixel were zeroed. 
%
% REFERENCES:
%	-None-
%
% PROGRAMMER COMMENTS:
%	-None-
%
% LIMITATIONS:
%	-None-
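%
% EXAMPLE:
%	A worked illustration with hypothetical scores. eliminateUniquePixels
%	is a local function, so this would have to be run from within this
%	file, or with the function copied out. 
%
%		Snew = [10 1; 0 1; 0 1; 0 -3];
%		Snew = eliminateUniquePixels(Snew,0.5)
%
%	The first column has all of its positive weight in pixel 1, so that
%	entry is zeroed. In the second column, each positive entry carries
%	only 1/3 of the positive weight and is kept, while the single
%	negative entry carries all of the negative weight and is zeroed,
%	giving [0 1; 0 1; 0 1; 0 0]. 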

% Revision History:
%   1.0.0, 2015-11-09, Stephen M. Anthony: Documented. 
%	1.1.0, 2015-12-07, Stephen M. Anthony: Streamlined. 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Create positive and negative subsets
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

sPositive = max(Snew,0);
sNegative = min(Snew,0);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Find the total amount of positive and negative
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

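%Sum down each column (over pixels); the dimension is given explicitly so
%that the result is a per-eigenvector total even if Snew has a single row. 
totalPositive = sum(sPositive,1);
totalNegative = sum(sNegative,1);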

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Find any pixels which are responsible for greater than maxPixelFraction 
%of the total amount of a given sign of a component. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

positiveFraction = sPositive./(ones(size(Snew,1),1)*totalPositive);
negativeFraction = sNegative./(ones(size(Snew,1),1)*totalNegative);

excessiveContribution = positiveFraction>maxPixelFraction | ...
	negativeFraction>maxPixelFraction;

%Any pixels which are responsible for greater than the specified fraction 
%of the total amount of a given sign of a component must be contaminated. 
%Any real spectral components should be distributed through multiple 
%pixels, not concentrated in just a few pixels. 
Snew(excessiveContribution) = 0;
